SPLIT TREE DATA STRUCTURE
Technical Field of the Invention 



The present invention relates generally to document formatting and, in particular,
to document formatting using input trees. More specifically, the present invention relates
to a method and apparatus for creating a split tree by assigning marks to an input tree and
for generating tree fragments from the split tree. The invention also relates to a computer
program product including a computer readable medium having recorded thereon a
computer program for creating the split tree by assigning marks to the input tree and for
generating tree fragments from the split tree.

Background Art

High-level languages for document formatting exist in many forms, some of
which accept as input a description of a document in the form of an input tree. An
example of such an input tree is presented in Fig. 1. The input tree comprises two main
types of elements, namely data elements and instruction elements; the elements of Fig. 1
are identified by type in Fig. 2. The data elements contain data to be formatted. Each data
element can include text or images. Instruction elements contain information used by the
formatting algorithm to arrange the data elements on the pages when producing an output.
As seen in Fig. 3, the data elements form leaf nodes in the tree whereas the instruction
elements form nodes. Further, nodes A and B are child nodes of node Cat(2). The left
node, node A in the example, is defined as the first child, whereas the right node, node
B, is defined as the second child.

Instruction elements can be thought of as operators that define the output
positions of data elements. There are a number of instruction elements or operators
available. Certain of the instruction elements, for example those defining text size, text
spacing and the like, typically have only one child.

The other relevant instruction in Figs. 1 to 3 is "Cat( )", which performs vertical
concatenation. Each concatenation instruction places its left parameter (the first child)
above its right parameter (the second child). For example, Cat(2) would place the content of
data element B below the content of data element A.

As an input, data elements do not have an actual position. The actual position of
the data elements is determined by a document formatter by evaluating the input tree. In
the example of Fig. 1, the formatter, following the directions given by the instruction
elements, determines that the result of the Cat(2) operator is placed above the result of the
Cat(3) operator. The Cat(2) operator in turn places the content of data element A above
data element B, and the Cat(3) operator places the content of data element C above data
element D. The result is that data elements A, B, C and D are stacked in that order.
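
The evaluation order described above can be illustrated with a minimal sketch. It assumes
that the tree of Fig. 1 has a root concatenation, here called Cat(1), above Cat(2) and
Cat(3); the names Node and stacking_order are hypothetical and not part of the
specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Instruction element (has children) or data element (leaf)."""
    name: str
    first: Optional["Node"] = None   # left parameter, placed above
    second: Optional["Node"] = None  # right parameter, placed below


def stacking_order(node: Optional[Node]) -> List[str]:
    """Return the data element names from top to bottom of the output."""
    if node is None:
        return []
    if node.first is None and node.second is None:
        return [node.name]  # a data element contributes itself
    # Cat places the result of its first child above that of its second child.
    return stacking_order(node.first) + stacking_order(node.second)


# Assumed shape of Fig. 1: a root Cat(1) over Cat(2) and Cat(3).
tree = Node("Cat(1)",
            Node("Cat(2)", Node("A"), Node("B")),
            Node("Cat(3)", Node("C"), Node("D")))
order = stacking_order(tree)
print(order)
```

The sketch only recovers the vertical order; actual positions depend on element sizes,
which the formatter determines.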

However, when the data elements are placed on a page in order to produce an
output, it is sometimes found that the result is too big to fit on one page. Therefore,
following the order in which the data elements are placed, should the next data element not
fit into the space left on the present page, the remainder of the data elements are promoted
to the next page. In the example presented in Fig. 1, only data elements A, B and C are
located on page 1 and data element D is located on page 2. The result of the
evaluation of the input tree is shown in Fig. 4. The formatter also positions the data
elements on the pages.
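
The promotion of data elements to the next page can be sketched as follows. The element
heights and the page height are invented for illustration, and the function name paginate
is hypothetical; the specification does not give sizes.

```python
from typing import List, Tuple


def paginate(elements: List[Tuple[str, int]], page_height: int) -> List[List[str]]:
    """Place (name, height) elements in order, starting a new page when needed."""
    pages: List[List[str]] = [[]]
    remaining = page_height
    for name, height in elements:
        # Promote to a new page when the element does not fit, unless the
        # current page is still empty (an oversized element is placed anyway).
        if height > remaining and pages[-1]:
            pages.append([])
            remaining = page_height
        pages[-1].append(name)
        remaining -= height
    return pages


# Assumed sizes: four elements of height 3 on pages of height 10.
pages = paginate([("A", 3), ("B", 3), ("C", 3), ("D", 3)], page_height=10)
print(pages)
```

With these assumed sizes, A, B and C land on the first page and D on the second, matching
the outcome described for Fig. 4.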

To better explain the above, a galley and a galley target need to be defined. A
galley is a part of the input tree, whereas a galley target is a placeholder for a galley. The
galley is therefore not evaluated at its position in the input tree but rather at the point where
it is placed into the galley target. The galley target is a stretchable area on the output area
with size 0 when no galley is targeted into it. When a galley is targeted into a galley
target, the galley target stretches to accommodate the incoming galley. The galley target
cannot stretch infinitely, though: there are restrictions on the maximum size of each
galley target. Should the galley target reach its maximum size before
the galley is exhausted, a new galley target is found for the remainder of the content of
the galley, and the process of accommodating the remainder of the galley in its new
target continues.

Placing a galley into a series of targets involves splitting the input tree into
fragments, each fragment filling its target. Fig. 5 shows another example of a galley
(input tree). It is assumed that the galley in the example of Fig. 5 is targeted into a
sequence of targets. Only data elements A and B fit into the first galley target in the
sequence, taking all the available space defined by the restrictions on the first galley
target. The rest of the galley is promoted to the following galley target in the sequence.

Figs. 6A and 6B show the result after a portion of the galley of Fig. 5 is placed
into the sequence of galley targets. The original galley tree shown in Fig. 5 is split into
two tree fragments. The first fragment, shown in Fig. 6A, is placed into the first
galley target. The remainder of the galley, shown in Fig. 6B, is waiting to be placed into
the following galley targets. Thus, the original galley was split into two separate smaller
trees.

The arrangement described above is used in batch text formatters which produce
an output from an input tree. However, this arrangement suffers from the disadvantage
that once the galley is split into smaller trees, it is hard to relate the fragmented trees back
to their original galley tree. Doing so is necessary, for example, whenever the order or
contents of the data objects are changed, and this difficulty prevents interactive formatting.

Disclosure of the Invention 
It is an object of the present invention to substantially overcome, or at least
ameliorate, one or more disadvantages of existing arrangements.

According to a first aspect of the invention, there is provided a method of
creating a split tree for representing both an input tree and at least one tree fragment,
wherein said input tree comprises a plurality of nodes, said method comprising the steps
of:

determining which of said plurality of nodes fit into a galley target; and

marking said nodes that fit into said galley target with a mark to create said tree
fragment for each galley target, said marked input tree being said split tree.

According to a second aspect of the invention, there is provided a method of
splitting a split tree into at least one tree fragment, wherein each one of said tree
fragments is associated with a mark and said split tree comprises at least one node marked
with said mark, said method comprising the steps of:

identifying said at least one node with said mark; and

creating said at least one tree fragment from said nodes marked with each of
said marks.

According to another aspect of the invention, there is provided an apparatus for
implementing any one of the aforementioned methods.

According to another aspect of the invention there is provided a computer 
program product including a computer readable medium having recorded thereon a 
computer program for implementing any one of the methods described above. 
Brief Description of the Drawings

A number of preferred embodiments of the present invention will now be 
described with reference to the drawings, in which: 
Fig. 1 is an example of an input tree;
Fig. 2 illustrates types of elements in the input tree;
Fig. 3 illustrates nodes, leaf nodes and edges in the input tree;
Fig. 4 shows the result of evaluation of the input tree;
Fig. 5 shows an original galley before splitting;
Figs. 6A and 6B show the result of splitting the original galley of Fig. 5 into
smaller fragments;
Fig. 7 shows a split tree;
Fig. 8 shows the split tree of Fig. 7 after changes;
Fig. 9 is a flow diagram of a method of generating a split tree;
Fig. 10 is a flow diagram of a method of creating tree fragments from a split
tree;
Fig. 11 is a flow diagram of a method of finding a next node;
Fig. 12 is a flow diagram of a method of building a tree fragment; and
Fig. 13 is a schematic block diagram of a general purpose computer upon which
the preferred embodiment of the present invention can be practiced.

Detailed Description including Best Mode 
In the preferred embodiment of the present invention, the actual splitting of the
galley tree, as discussed above, does not take place. This allows the structure of the
galley to remain intact. The advantage of preserving the original structure of the galley
tree is that it allows interactive formatting. For instance, it makes it easier to re-send the
galley into its targets after the galley or its target has been modified. The re-sending of
the galley usually involves moving the splitting boundary between the tree fragments.

In the preferred embodiment, a galley is represented by a split tree data structure.
A split tree represents both the galley tree and the fragments of the galley tree that
result from splitting the galley. The advantage is that, instead of actually splitting the
galley tree, marks are added to the tree. The marks allow an algorithm operating on the
split tree to interpret the split tree in the same way as tree fragments would be
interpreted.

The split tree corresponding to the two galley fragments in the above example of
Fig. 5 is presented in Fig. 7. The fragment of the galley tree that is placed into the first
galley target is marked with marker '1'. The fragment that did not fit into the first galley
target is marked with marker '0'. The marker '0' is a special marker that marks nodes
that are still waiting to be placed into galley targets.

The split tree data structure allows each tree fragment in the split tree to be
marked with a unique mark. In the preferred embodiment, integers are used as marks.

Fig. 8 shows another example of a split tree. The galley of Fig. 7 is placed into a
sequence of galley targets. In this example, however, all three data elements A, B and
C fit into the first galley target. This can result from a change in the
available space of the first galley target or in data elements A, B and/or C.

Each node in the split tree can have two, one or no children. The nodes that have
no children are also called leaf nodes. The criteria for interpreting the split tree are as
follows:

• If a node is marked with a marker 'n', the whole subtree attached to that node belongs
to the 'n' fragment.

• If both the left and right subtrees of a node contain nodes marked with 'n', the node
belongs to the 'n' fragment. This criterion has further implications. If a node does
not contain nodes marked with the same marker in its left and right subtrees, then
such a node does not belong to any fragment and is referred to as an alienated node.
Any node that is not an alienated node belongs to one tree fragment only.

• If a node has only one child and the child subtree has a node marked with 'n', then the
node belongs to the 'n' fragment.
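
The three criteria above can be sketched as a membership test. This is an illustration
under assumptions, not the specification's own procedure: the tree shape (a root Cat(1)
over Cat(2) and Cat(3)) and the marks are guessed from the description of Fig. 8, and the
names Node, contains_mark and members are hypothetical. A mark of None means the node
itself carries no mark.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    mark: Optional[int] = None       # None: the node itself carries no mark
    first: Optional["Node"] = None
    second: Optional["Node"] = None


def children(node: Node) -> List[Node]:
    return [c for c in (node.first, node.second) if c is not None]


def contains_mark(node: Node, n: int) -> bool:
    """True if the node or any node in its subtree is marked with n."""
    return node.mark == n or any(contains_mark(c, n) for c in children(node))


def members(node: Node, n: int, inherited: bool = False) -> List[str]:
    """Names of the nodes in this subtree that belong to fragment n."""
    kids = children(node)
    if inherited or node.mark == n:
        belongs, inherited = True, True   # criterion 1: whole subtree belongs
    elif node.mark is not None:
        return []                         # subtree belongs to another fragment
    elif kids:
        # criteria 2 and 3: every child subtree must contain the mark
        belongs = all(contains_mark(c, n) for c in kids)
    else:
        belongs = False                   # unmarked leaf
    result = [node.name] if belongs else []
    for c in kids:
        result += members(c, n, inherited)
    return result


# Assumed marking for Fig. 8: A, B and C in fragment 1, D still waiting (0).
tree = Node("Cat(1)", None,
            Node("Cat(2)", 1, Node("A", 1), Node("B", 1)),
            Node("Cat(3)", None, Node("C", 1), Node("D", 0)))
fragment_1 = members(tree, 1)
fragment_0 = members(tree, 0)
print(fragment_1, fragment_0)
```

In this assumed tree, Cat(3) appears in neither list: its two subtrees carry different
marks, so it is an alienated node in the sense defined above.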

A method 30 of adding split marks to a split tree is presented in Fig. 9. The
purpose of the method 30 is to place the galley tree into its galley targets. More precisely,
it generates marks on the galley tree, the marks defining which parts of the galley tree are
placed into which galley target.

In step 1, the method 30, called GST, accepts three input parameters:

• node - The subtree defined by 'node' that is currently being pushed into a galley
target;

• available_space - The space available in the galley target for 'node'; and

• mark - The mark used to mark all nodes that are pushed into a galley target.

In step 3 it is determined whether 'node' is already marked. If 'node' is marked,
it means that it has already been placed into some other target and the subtree defined by
'node' is therefore ignored. The method 30 is then terminated in step 4 after returning
'available_space'. In this case 'available_space' has not changed.

For unmarked nodes, the method 30 proceeds to step 6. Step 6 determines
whether the subtree defined by 'node' can fit into 'available_space'. If that subtree can fit
into 'available_space', 'node', together with all the other nodes in that subtree, is
marked with 'mark' in step 8. As 'node' uses up some of 'available_space',
'available_space' is decreased by the size of the subtree defined by 'node' in step 9. The
value of 'available_space' is returned.

However, if in step 6 the subtree defined by 'node' cannot fit into
'available_space', then the method 30 continues by trying to split the subtree into smaller
subtrees. In order to do so, in step 13 which follows, the method 30 verifies whether
'node' is a leaf element (thus having no child nodes) or has been defined as not splittable.
In these cases, 'node' can neither be placed into the target nor be split so that a
smaller subtree of 'node' would fit into the target. In step 14, the data element represented
by 'node' is discarded and 'node' is marked with a special 'discarded' mark.

If 'node' is determined to be splittable in step 13, step 17, which follows,
determines whether 'node' has only one child. If so, step 18 recursively calls the method 30
with the child node as 'node'.

Since leaf nodes were filtered out by step 13 and nodes with only one child by
step 17, 'node' must have two children. When this occurs, step 19 recursively calls the
method 30 with the first child of 'node' as 'node', and then, if there is space left in the
target, which is determined in step 24, the method 30 is called again in step 25 with the
second child of 'node' as 'node'. The method 30 terminates by returning the available
space in the galley target.

The method 30 can be repeated with the next galley target, until all the data 
elements are placed. 
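
The steps of method 30 can be sketched as follows. This is an illustrative reading of the
description, not a definitive implementation. Several details are assumptions: the numeric
value of the 'discarded' mark, the size attribute and how subtree size is computed, the
use of None (rather than '0') for nodes not yet placed, the element sizes, and the guard
that prevents a partly placed subtree from being marked as a whole.

```python
from dataclasses import dataclass
from typing import List, Optional

DISCARDED = -1  # hypothetical value for the special 'discarded' mark


@dataclass
class Node:
    name: str
    size: int = 0                    # assumed: space taken by the node's own content
    first: Optional["Node"] = None
    second: Optional["Node"] = None
    mark: Optional[int] = None       # None: not yet placed into any target
    splittable: bool = True


def children(node: Node) -> List[Node]:
    return [c for c in (node.first, node.second) if c is not None]


def subtree(node: Node) -> List[Node]:
    """The node followed by all of its descendants."""
    nodes = [node]
    for c in children(node):
        nodes += subtree(c)
    return nodes


def gst(node: Node, available_space: int, mark: int) -> int:
    """Mark what fits into the target; return the space left over."""
    if node.mark is not None:                                  # step 3
        return available_space                                 # step 4
    nodes = subtree(node)
    size = sum(n.size for n in nodes)
    # Assumption: a subtree with already placed parts is never marked as a whole.
    untouched = all(n.mark is None for n in nodes)
    if untouched and size <= available_space:                  # step 6
        for n in nodes:
            n.mark = mark                                      # step 8
        return available_space - size                          # step 9
    kids = children(node)
    if not kids or not node.splittable:                        # step 13
        node.mark = DISCARDED                                  # step 14
        return available_space
    if len(kids) == 1:                                         # step 17
        return gst(kids[0], available_space, mark)             # step 18
    available_space = gst(kids[0], available_space, mark)      # step 19
    if available_space > 0:                                    # step 24
        available_space = gst(kids[1], available_space, mark)  # step 25
    return available_space


# Assumed sizes: each data element takes 5 units, each target offers 10.
a, b, c, d = (Node(n, 5) for n in "ABCD")
cat2, cat3 = Node("Cat(2)", 0, a, b), Node("Cat(3)", 0, c, d)
tree = Node("Cat(1)", 0, cat2, cat3)
left_1 = gst(tree, 10, 1)   # first galley target
left_2 = gst(tree, 10, 2)   # next galley target
marks = {n.name: n.mark for n in subtree(tree)}
print(marks)
```

With these assumed sizes the first call marks Cat(2), A and B, and the second call marks
Cat(3), C and D, while the root stays unmarked.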

The result is that galleys in the input tree are now converted into split trees by
marking the nodes. Fig. 10 presents a method 40 for generating tree fragments from a
split tree. The method 40 calls procedures 50 and 70, presented in Figs. 11 and 12
respectively. These procedures 50 and 70 are discussed first.

Procedure 50 is a recursive procedure that finds a next node belonging to a tree
fragment in a split tree. This tree fragment is one of the tree fragments previously defined
by marking the input tree, thereby obtaining the split tree. Therefore, the tree fragment is
specified by one of the marks used in the split tree. The procedure terminates once a
correctly marked node is found.

Procedure 50 commences with step 41 which accepts two input parameters:

• startNode - The node defining the subtree wherein the marked node is searched
for; and

• mark - The mark that marks the node being searched for.

Procedure 50 proceeds to step 43 where it is determined whether 'startNode' is
marked with 'mark'. If so, this is the node that is searched for, and the procedure 50
terminates in step 44 by returning 'startNode' to its calling method or procedure.

If 'startNode' is not marked with 'mark', step 46 follows, which determines
whether 'startNode' is marked with another mark. If so, 'startNode' and all the
other nodes in its subtree are marked with a mark other than 'mark'. It is then not possible
for any node marked with 'mark' to be found in the subtree under review, and the procedure
50 terminates in step 47 by returning 'NULL' to its calling method or procedure.

Therefore, only an unmarked node proceeds to step 48 from where, if
'startNode' has only one child, procedure 50 is called in step 49 with the child node as
'startNode'.

If the procedure 50 called with the child node returns with a found node,
'startNode' belongs to the tree fragment marked with 'mark'. The procedure 50 returns
'startNode' and terminates in step 52. If the procedure 50 returns with 'NULL', neither
'startNode' nor any of the nodes in the subtree belong to the tree fragment marked with
'mark'. Procedure 50 terminates after returning 'NULL' in step 53.

Therefore, only unmarked nodes with two children proceed to step 54 from
where procedure 50 is called with the first child as 'startNode'. It is followed by step 55
from where procedure 50 is called with the second child as 'startNode'.

If the procedure 50 returns with 'NULL' for both the first child, determined in
step 56, and the second child, determined in step 57, the calling procedure 50 terminates
after also returning 'NULL' in step 58. Thus 'mark' has not been found in any one of the
children. If 'mark' has been found in only one of the children, procedure 50 terminates
after returning that child node to the calling method or procedure in step 60 or step 62.
The remaining alternative is that both children were marked with 'mark', in which case
the criteria in both step 56 and step 61 are not met. The procedure 50 terminates by
returning 'startNode', being the parent of the two children, at step 63.
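
Procedure 50 as described above can be sketched as follows. The sketch makes several
assumptions: NULL is represented by None, an unmarked leaf (a case the text does not
discuss) yields None, "that child node" is read as the node found within that child's
subtree, and the example tree is the guessed shape of Fig. 8 with a root Cat(1). The names
Node and find_next_node are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    mark: Optional[int] = None       # None: the node itself carries no mark
    first: Optional["Node"] = None
    second: Optional["Node"] = None


def children(node: Node) -> List[Node]:
    return [c for c in (node.first, node.second) if c is not None]


def find_next_node(start_node: Node, mark: int) -> Optional[Node]:
    """Find the topmost node of fragment 'mark' in the subtree, or None."""
    if start_node.mark == mark:                            # step 43
        return start_node                                  # step 44
    if start_node.mark is not None:                        # step 46
        return None                                        # step 47
    kids = children(start_node)
    if not kids:
        return None  # assumption: unmarked leaf, not covered by the text
    if len(kids) == 1:                                     # step 48
        found = find_next_node(kids[0], mark)              # step 49
        return start_node if found is not None else None   # steps 52 and 53
    first = find_next_node(kids[0], mark)                  # step 54
    second = find_next_node(kids[1], mark)                 # step 55
    if first is None and second is None:                   # steps 56 and 57
        return None                                        # step 58
    if first is None or second is None:
        return first if second is None else second         # steps 60 and 62
    return start_node                                      # step 63


# Assumed marking for Fig. 8: A, B and C in fragment 1, D still waiting (0).
tree = Node("Cat(1)", None,
            Node("Cat(2)", 1, Node("A", 1), Node("B", 1)),
            Node("Cat(3)", None, Node("C", 1), Node("D", 0)))
root_1 = find_next_node(tree, 1)
root_0 = find_next_node(tree, 0)
missing = find_next_node(tree, 7)
print(root_1.name, root_0.name, missing)
```

For mark 1 both subtrees of the root contain the mark, so the root itself is returned; for
mark 0 only D qualifies.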

Referring to Fig. 12, procedure 70 for generating nodes of a tree fragment is 
presented. Procedure 70 is invoked recursively to find all the nodes of a tree fragment in 
a split tree and to generate a whole tree fragment. The tree fragment is defined by a 
specific mark. 

In step 75, the procedure accepts three parameters:

• node - A node of a tree fragment;

• mark - The mark that marks the nodes belonging to a tree fragment in a split tree;
and

• perform - A function that is invoked for each node belonging to the tree fragment
defined by 'mark'.

If step 76 determines that 'node' is equal to NULL, there are no nodes in that
subtree marked with 'mark' and step 77 returns the procedure to its calling method or
procedure.

However, if 'node' is not equal to NULL in step 76 then step 80 invokes the
'perform' function. This caller-defined procedure performs required operations on the
nodes of the 'mark' tree fragment. This may include saving the result to a file, modifying
the attributes of the nodes or displaying the result in a tree form on a user interface.

If step 82 determines that 'node' has no children (a leaf node), the procedure 70
terminates in step 77 and returns to its calling procedure or method. If step 84 determines
that 'node' has only one child, the child subtree is searched in step 86 to find the next
node belonging to the tree fragment presently being generated. After finding the next
node marked with 'mark', procedure 70 is called again in step 88 with this found next
node as 'node'. Procedure 70 calling itself recursively will continue adding new child
nodes to the generated tree until there are no more nodes marked with 'mark' in the
subtree defined by the child node.

If 'node' has two children, the procedure 70 proceeds to step 90. There, the
child subtrees are treated in the same way as a single child: steps 90 and 91
find all the nodes in the subtree defined by the first child and generate a tree until there
are no more nodes marked with 'mark' in the subtree defined by the first child node. Steps
94 and 95 repeat identical steps for the second child. The procedure 70 terminates in step
77.

In the preferred embodiment of the invention, method 40, illustrated in Fig. 10, 
starts in step 100 by accepting three parameters: 

• split_tree - The root node of the split tree;

• mark - The mark that marks the nodes belonging to a tree fragment in a split tree;
and

• perform - The function that is invoked for each node belonging to the 'mark' tree
fragment.

Method 40 proceeds to step 103 where a root node of a tree fragment in a split
tree is found. This is performed by calling procedure 50, which will return the first node
found with 'mark' as its marking. Procedure 50 may return 'NULL' if the split tree does
not contain a tree fragment marked with 'mark'.

Having established the root node in step 103, step 105 calls procedure 70 passing
the root node as 'node'. As discussed above, procedure 70 is a recursive procedure for
iterating over all the nodes of the tree fragment in the split tree, finding the correctly
marked nodes and creating the required tree fragment. Method 40 terminates in step 106.
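
Procedure 70 and method 40 can be sketched together as follows. The sketch is illustrative
only: the function names are hypothetical, NULL is represented by None, the one-child and
two-child cases are folded into a single loop, and the example tree is the guessed shape of
Fig. 8 with a root Cat(1). A compact version of procedure 50 is included so that the block
runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Node:
    name: str
    mark: Optional[int] = None       # None: the node itself carries no mark
    first: Optional["Node"] = None
    second: Optional["Node"] = None


def children(node: Node) -> List[Node]:
    return [c for c in (node.first, node.second) if c is not None]


def find_next_node(start_node: Node, mark: int) -> Optional[Node]:
    """Compact form of procedure 50 (unmarked leaves yield None, an assumption)."""
    if start_node.mark is not None:
        return start_node if start_node.mark == mark else None
    found = [f for f in (find_next_node(c, mark) for c in children(start_node))
             if f is not None]
    if not found:
        return None
    kids = children(start_node)
    # One hit among two children: pass it up. Otherwise the parent belongs too.
    return found[0] if len(kids) == 2 and len(found) == 1 else start_node


def build_fragment(node: Optional[Node], mark: int,
                   perform: Callable[[Node], None]) -> None:
    """Procedure 70: visit every node of the fragment below 'node'."""
    if node is None:                                       # step 76
        return                                             # step 77
    perform(node)                                          # step 80
    for child in children(node):                           # steps 82, 84 and 90
        nxt = find_next_node(child, mark)                  # steps 86, 90 and 94
        build_fragment(nxt, mark, perform)                 # steps 88, 91 and 95


def generate_fragment(split_tree: Node, mark: int,
                      perform: Callable[[Node], None]) -> None:
    """Method 40: locate the fragment root, then walk the fragment."""
    root = find_next_node(split_tree, mark)                # step 103
    build_fragment(root, mark, perform)                    # step 105


# Assumed marking for Fig. 8: A, B and C in fragment 1, D still waiting (0).
tree = Node("Cat(1)", None,
            Node("Cat(2)", 1, Node("A", 1), Node("B", 1)),
            Node("Cat(3)", None, Node("C", 1), Node("D", 0)))
visited_1: List[str] = []
visited_0: List[str] = []
generate_fragment(tree, 1, lambda n: visited_1.append(n.name))
generate_fragment(tree, 0, lambda n: visited_0.append(n.name))
print(visited_1, visited_0)
```

Here 'perform' simply records node names; in practice it could save, modify or display the
nodes, as the description of step 80 suggests. The original tree is left untouched, so
either fragment can be regenerated after the marks change.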

The foregoing preferred methods describe only a number of embodiments of the
present invention, and modifications can be made thereto without departing from the
scope of the present invention as defined in the appended claims.

The method 30 of marking an input tree to create a split tree and method 40 of
creating tree fragments from the split tree are preferably practiced using a conventional
general-purpose computer system 200, such as that shown in Fig. 13, wherein the
processes of Figs. 9 to 12 may be implemented as software, such as an application
program executing within the computer system 200. In particular, the steps of methods
30 and 40 are effected by instructions in the software that are carried out by the computer.
The software may be stored in a computer readable medium, including the storage
devices described below, for example. The software is loaded into the computer from the
computer readable medium, and then executed by the computer. A computer readable
medium having such software or computer program recorded on it is a computer program
product. The use of the computer program product in the computer preferably effects an
advantageous apparatus for marking an input tree to create a split tree and for
creating tree fragments from the split tree in accordance with the embodiments of the
invention.

The computer system 200 comprises a computer module 202, input devices such
as a keyboard 210 and mouse 212, output devices including a printer 208 and a display
device 104.

The computer module 202 typically includes at least one processor unit 214, a
memory unit 218, for example formed from semiconductor random access memory
(RAM) and read only memory (ROM), input/output (I/O) interfaces including a video
interface 217, and an I/O interface 216 for the keyboard 210 and mouse 212. A storage
device 224 is provided and typically includes a hard disk drive 226 and a floppy disk
drive 228. A magnetic tape drive (not illustrated) may also be used. A CD-ROM
drive 220 is typically provided as a non-volatile source of data. The components 214
to 228 of the computer module 202 typically communicate via an interconnected bus 230
in a manner which results in a conventional mode of operation of the computer
system 200 known to those in the relevant art. Examples of computers on which the
embodiments can be practised include IBM-PCs and compatibles, Sun Sparcstations or
alike computer systems evolved therefrom.

Typically, the application program of the preferred embodiment is resident on
the hard disk drive 226 and read and controlled in its execution by the processor 214.
Intermediate storage of the program may be accomplished using the semiconductor
memory 218, possibly in concert with the hard disk drive 226. In some instances, the
application program may be supplied to the user encoded on a CD-ROM or floppy disk
and read via the corresponding drive 220 or 228. Still further, the software can also be
loaded into the computer system 200 from other computer readable mediums including
magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infra-red
transmission channel between the computer module 202 and another device, a computer
readable card such as a PCMCIA card, and the Internet and Intranets including email
transmissions and information recorded on websites and the like. The foregoing is merely
exemplary of relevant computer readable mediums. Other computer readable mediums
may be practiced without departing from the scope and spirit of the invention.

The methods of marking an input tree to create a split tree and of creating tree
fragments from the split tree may alternatively be implemented in dedicated hardware
such as one or more integrated circuits performing the functions or sub functions of
processes 30 and/or 40.

Industrial Applicability 

It is apparent from the above that the embodiments of the invention are
applicable to computer and data processing industries.

The foregoing describes only some embodiments of the present invention, and
modifications and/or changes can be made thereto without departing from the scope and
spirit of the invention, the embodiments being illustrative and not restrictive.

In the context of this specification, the word "comprising" means "including
principally but not necessarily solely" or "having" or "including" and not "consisting only
of". Variations of the word "comprising", such as "comprise" and "comprises", have
corresponding meanings.

Claims:

1. A method of creating a split tree for representing an input tree and at
least one tree fragment, wherein said input tree comprises a plurality of nodes, said 
method comprising the steps of: 

determining which of said plurality of nodes fit into a galley target; and 
marking said nodes that fit into said galley target with a mark to create one said 
tree fragment for each galley target, said marked input tree being said split tree. 

2. A method of splitting a split tree into at least one tree fragment,
wherein each one of said tree fragments is associated with a mark and said split tree
comprises at least one node marked with said mark, said method comprising the steps of:

identifying said at least one node with said mark; and

creating said at least one tree fragment from said nodes marked with each of
said marks.

3. Apparatus for creating a split tree for representing an input tree and at
least one tree fragment, wherein said input tree comprises a plurality of nodes, said
apparatus comprising:

means for determining which of said plurality of nodes fit into a galley target;
and

means for marking said nodes that fit into said galley target with a mark to
create said tree fragment for each galley target, said marked input tree being said split
tree.

4. Apparatus for splitting a split tree into at least one tree fragment,
wherein each one of said tree fragments is associated with a mark and said split tree
comprises at least one node marked with said mark, said apparatus comprising:

means for identifying said at least one node with said mark; and

means for creating said at least one tree fragment from said nodes marked with
each of said marks.

5. A computer program product including a computer readable medium
incorporating a computer program for creating a split tree for representing an input tree
and at least one tree fragment, wherein said input tree comprises a plurality of nodes,
said computer program product comprising:

means for determining which of said plurality of nodes fit into a galley target;
and

means for marking said nodes that fit into said galley target with a mark to
create said tree fragment for each galley target, said marked input tree being said split
tree.

6. A computer program product including a computer readable medium
incorporating a computer program for splitting a split tree into at least one tree
fragment, wherein each one of said tree fragments is associated with a mark and said
split tree comprises at least one node marked with said mark, said computer program product
comprising:

means for identifying said at least one node with said mark; and

means for creating said at least one tree fragment from said nodes marked with
each of said marks.

7. A method of creating a split tree for representing an input tree and at
least one tree fragment, said method substantially as described herein with reference to
Figs. 7 to 13.

DATED this twenty-first Day of June, 1999

Canon Kabushiki Kaisha